1 Introduction

FastTrack is a (nearly) automated formant tracking tool that enables you to extract formant contours quickly and easily. The tool runs as a plug-in in Praat, built from a collection of Praat scripts, so there is no need for you to code anything. There is also room for customising the scripts to build a more individualised tool. FastTrack was developed by Santiago Barreda at UC Davis, and more information can be found here: https://github.com/santiagobarreda/FastTrack

Also, the following paper explains the tool in more detail:

Barreda, S. (2021). Fast Track: fast (nearly) automatic formant-tracking using Praat. Linguistics Vanguard, 7(1), 20200051. https://doi.org/10.1515/lingvan-2020-0051

1.1 Workflow overview

In the workshop, I will demonstrate my typical workflow of acoustic (spectral) analysis using FastTrack. I usually follow these steps:

  1. Record audio data (well, quite obvious)

  2. Annotate audio files in Praat TextGrids

  • I usually start this using forced alignment (e.g., Montreal Forced Aligner: MFA) and then refine segment boundaries manually.
  • I use MFA because it can annotate segments using ARPABET that FastTrack can recognise.
  • It is helpful to create at least two tiers: word (orthography) and phone (ARPABET). You can manually annotate the words first, and then MFA will create the phone tier for you.
  3. Define segments of interest
  • FastTrack starts to kick in here. By default, FastTrack looks for vowels, as it is designed primarily for analysing vowels (or, to be more specific, vowel-inherent spectral change: VISC). But the minimum requirement for using FastTrack is that formant structure is visible throughout each audio file to be analysed, and you can tweak the settings to analyse sonorant consonants (e.g., liquids, semi-vowels).
  4. Extract vowels (or vocalic portions of interest)
  • FastTrack estimates formants throughout each audio file. This means that it achieves the highest formant estimation accuracy when spectral structure is visible throughout each file.
  • FastTrack has an Extract vowels function that enables you to extract the vocalic portions of interest before analysis.
  5. Estimate formants using Track Folder
  • Once individual segments have been extracted, you can bulk-estimate formant frequencies for all audio files stored in the same directory.
  • Some parameters need to be specified, such as the maximum and minimum frequency of analysis windows, the number of formants to be extracted (3 or 4), the number of data points (i.e., "bins") to be output, etc.
  6. Visualise and run stats using R
  • This is the fun bit!

1.2 Workshop objectives

In this workshop, I will mainly explain and demonstrate steps 3-5 from above. If you would like to follow along, you can install FastTrack beforehand. A detailed step-by-step guide is available in Santiago’s GitHub repository with some video illustrations. See the wiki on his GitHub repository for the tutorial on installation (and many other things!).

1.3 Data

We are going to analyse vowel production from "the North Wind and the Sun" passage produced by speakers of different L1 backgrounds. We will use data from the ALLSSTAR Corpus, which contains a number of spontaneous and scripted speech recordings produced by English speakers from different language backgrounds:

Bradlow, A. R. (n.d.) ALLSSTAR: Archive of L1 and L2 Scripted and Spontaneous Transcripts And Recordings. Retrieved from https://oscaar3.ling.northwestern.edu/ALLSSTARcentral/#!/recordings.

You can download a subset of the corpus data to work with in this workshop from here.

The data contains recordings of the North Wind and the Sun passage by 22 speakers from various L1 backgrounds: Chinese-Cantonese (n = 4), Chinese-Mandarin (n = 4), English (n = 4), Japanese (n = 2), Korean (n = 4), and Spanish (n = 4). In each language group, half the speakers are female and the other half male. The file name convention is: ALL_[speaker number]_[gender: F or M]_[L1: CCT, CMN, ENG, JPN, KOR, SPA]_[L2: ENG]_NWS. Each audio file is accompanied by an annotated TextGrid file.
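To make the naming convention concrete, here is a small illustrative sketch (in Python, purely for demonstration; the pattern and the helper name are my own, not part of FastTrack or the corpus) that unpacks the metadata encoded in a file name:

```python
import re

# Hypothetical helper: unpack the metadata encoded in the convention
# ALL_[speaker number]_[gender]_[L1]_[L2]_NWS described above.
PATTERN = re.compile(
    r"ALL_(?P<speaker>\d{3})_(?P<gender>[FM])_"
    r"(?P<L1>CCT|CMN|ENG|JPN|KOR|SPA)_(?P<L2>ENG)_NWS"
)

def parse_filename(name: str) -> dict:
    """Return speaker number, gender, L1 and L2 as a dictionary."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return match.groupdict()

parse_filename("ALL_005_M_CMN_ENG_NWS")
# → {'speaker': '005', 'gender': 'M', 'L1': 'CMN', 'L2': 'ENG'}
```

The same positional logic reappears later when the R code pulls speaker, gender and L1 out of the file names with `str_sub()`.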

2 Let’s begin the analysis!

2.1 Step 1: Extracting vowels

We have recorded speech data from participants and/or obtained corpus data already. After some agony, we have managed to segment everything and are now ready to proceed onto acoustic analysis.

When using FastTrack, the first thing we need to do is to extract vocalic portions that we would like to analyse. Let’s extract vowels using FastTrack before submitting them to formant estimation.

FastTrack extracts segments specified in the spreadsheet vowelstoextract_default.csv. By default, the csv file lists vowels, but you can modify the list if you are interested in extracting other types of sounds (e.g., liquids, semi-vowels). You can find this file by going to: FastTrack-master -> Fast Track -> dat.

  • When changing something here, I would strongly suggest that you keep the default file, too. FastTrack only recognises the spreadsheet when it is named vowelstoextract_default.csv, which means you can simply give a different name to any spreadsheet you’d like to keep. For example, when I analysed liquids, I first copied the default file and renamed the copy vowelstoextract_vowel.csv. I then modified the default file so that the list only contained /l/ and /r/.

2.1.1 Procedure

Here is a somewhat detailed workflow:

  1. Download data.zip and save it somewhere on your computer.

  2. FastTrack requires a certain repository structure, so let’s set this up now. Specifically, we’ll need to save the audio and TextGrid files in separate folders, named sounds and textgrids respectively. Create the new folders, give them the appropriate names, and save the files in each folder.

  • It might also be useful to create an output folder at this stage, too. This is where the extracted files, which we will use for formant estimation later on, will be written.
  3. Open Praat and throw a random file in the object window. This will trigger the FastTrack functions to appear in the menu section.

  4. Select Tools…, then Extract vowels with TextGrids.

  5. Once a window pops up, specify the following:

  • Sound folder:
    • Path to "sounds" in the data folder containing .wav files.
  • TextGrid folder:
    • Path to "textgrids" in the data folder containing .TextGrid files.
  • Output folder:
    • Path to the folder where you wish to save the outputs. You could specify an existing location.
  • Which tier contains segmentation information?:
    • Specify the tier in which phonemic transcription/segmentation has been performed. In the current example, the segmentation is done in Tier 2 so type 2.
  • Which tier contains word information?:
    • Specify the tier with words. Type 1 in this case.
  • Is stress marked on vowels?:
    • Tick the box if you wish to take stress into account. If you used MFA to segment the speech, stress is marked alongside each vowel. For example, you will find "AE1", a TRAP vowel bearing primary stress, or "AE2", one bearing secondary stress, etc.
Vowel extraction setting window

  6. Press OK and execute!
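The stress digits mentioned in step 5 can be separated from the vowel symbol later in the analysis. A minimal sketch (Python, purely illustrative; the helper name is hypothetical, not a FastTrack or MFA function):

```python
# Hypothetical helper: split an ARPABET label such as "AE1" into the vowel
# symbol and its stress digit (1 = primary, 2 = secondary, 0 = unstressed).
def split_stress(label: str):
    if label and label[-1].isdigit():
        return label[:-1], int(label[-1])
    return label, None  # consonants and unmarked labels carry no stress digit

split_stress("AE1")  # → ('AE', 1)
split_stress("R")    # → ('R', None)
```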

2.1.2 Output files

Let’s check what files have been created at this stage. Go to the output folder to check what it contains:

  • segmentation_information.csv: A detailed summary of the extraction process, including input and output (audio) files, labels, duration, previous and next adjacent segments and stress information.

  • file_information.csv: A brief summary of the correspondence between output files, labels and colours. Colours are relevant when visualising vowels using Praat.

  • sounds: Extracted audio files. You’ll notice that the file names now have some numbers added to the end (e.g., ALL_005_M_CMN_ENG_NWS_0002.wav), indicating the order of extraction from the original audio file.
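Those trailing numbers make it easy to recover the extraction order programmatically. A sketch (Python, for illustration only, assuming the `_0002`-style suffix shown above; the helper name is my own):

```python
# Hypothetical helper: read the zero-padded counter appended to each
# extracted file name (e.g. "ALL_005_M_CMN_ENG_NWS_0002.wav" -> 2).
def extraction_index(filename: str) -> int:
    stem = filename.rsplit(".", 1)[0]   # drop the ".wav" extension
    return int(stem.rsplit("_", 1)[1])  # trailing counter as an integer

files = ["ALL_005_M_CMN_ENG_NWS_0010.wav", "ALL_005_M_CMN_ENG_NWS_0002.wav"]
sorted(files, key=extraction_index)
# → ['ALL_005_M_CMN_ENG_NWS_0002.wav', 'ALL_005_M_CMN_ENG_NWS_0010.wav']
```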

2.2 Step 2: Tracking a folder

Having extracted vowels (or any vocalic segments), we’re now ready to move onto the fun bit: formant tracking! Here is what FastTrack does:

  • FastTrack does a lot of regressions and chooses the best analysis out of multiple candidates automatically. It can also return images of all candidates and the winners for visual inspection.

  • It estimates the formant frequencies at multiple time points throughout the vowel duration.

  • The output is a csv file summarising the analysis, which can then be imported into R for tidying up, visualisation, statistics, etc.

2.2.1 Procedure

Formant estimation is based on the output folder from the extraction stage. Do not delete or move anything from the folder!

  1. Make sure that you know where the output folder is. This should contain at least: (1) file_information.csv, (2) segmentation_information.csv, and (3) a sounds folder containing a bunch of segmented audio files.

  2. Open Praat and throw a random file in the object window. This will trigger the FastTrack functions to appear in the menu section (if FastTrack is installed properly).

  3. Select Track folder….

  4. Specify the path to the output folder in the "Folder" section. (Hint: this is not the path to the sounds folder!)

  5. Adjust parameters for your needs. This includes:

  • Lowest/highest analysis frequency: The range of upper limits of the frequency window within which FastTrack seeks formants. FastTrack alters the ceiling of the analysis window in a number of steps to identify the optimal formant estimation.

    • Note: It is recommended that we do formant tracking for female and male speakers separately due to anatomical differences such as vocal tract length (details here). Given this, we will conduct the analysis with the 5000-7000 Hz range for female speakers and the 4500-6500 Hz range for male speakers.
  • Number of steps: Basically, the number of iterations of the upper-limit adjustment. 24 here means that FastTrack adjusts the upper frequency limit in 24 steps from the lowest to the highest analysis frequency that you specified in the previous step.

  • Number of formants: How many formants you’d like to extract. This has an impact on formant estimation accuracy, so explore a little if you don’t get a satisfactory analysis.

  • Make images comparing analysis/showing winners: You can choose whether you’d like a .png file for each analysis and each winner. I’d always tick the boxes here, but this depends on how much storage you have.

  • Also, do not forget to tick Show Progress: otherwise it’d look like the computer has frozen, and you wouldn’t know how long the processing will take.

  6. Hit OK and run!
Setting window for formant estimation (left) and an example of comparison image (right)
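As a rough mental model of the Number of steps parameter (a sketch under the assumption of evenly spaced ceilings, not FastTrack's actual implementation), 24 steps between a lowest and highest analysis frequency correspond to 24 candidate ceiling frequencies, each yielding one candidate analysis:

```python
# Illustrative only: evenly spaced candidate ceilings between the lowest and
# highest analysis frequency, endpoints included. FastTrack's own spacing
# may differ.
def candidate_ceilings(lowest: float, highest: float, steps: int) -> list:
    width = (highest - lowest) / (steps - 1)
    return [lowest + i * width for i in range(steps)]

ceilings = candidate_ceilings(5000, 7000, 24)  # the female range used above
# 24 candidate analyses, one formant analysis per ceiling frequency
```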

2.2.2 Output files

You’ll get quite a few output files from this stage. Let’s take a look at some of them that are most relevant here:

  • aggregated_data.csv: A spreadsheet summarising (1) input file, (2) duration, and (3) formant measurements averaged into the pre-specified number of bins (e.g., 11) for each vowel analysed. This can be found in the processed_data folder. (See the image below for a quick overview.)

  • winners.csv: A spreadsheet summarising which analysis step yields the most accurate formant tracking based on regression. Useful when you are not satisfied with the winner auto-selected by FastTrack and judge another analysis to be better.

  • images_comparison folder: A folder showing the results of the step-wise formant estimation. Useful when evaluating formant tracking quality relative to each of the analysis steps.

  • images_winner: A folder containing images of each "winning" analysis for each sound file.

  • csvs: A folder containing the initial formant tracks sampled at 2 ms intervals (before FastTrack bins them into a smaller number of data points) for each winning analysis. Useful for finer-grained analysis.

An example of aggregated_data.csv
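To see how a dense 2 ms track relates to the aggregated bins, here is an illustrative reduction (my own sketch of the general idea of averaging within time bins, not FastTrack's exact algorithm):

```python
import statistics

def bin_track(samples, n_bins=11):
    """Average a dense formant track into n_bins roughly equal time bins."""
    n = len(samples)
    means = []
    for i in range(n_bins):
        start = round(i * n / n_bins)
        stop = round((i + 1) * n / n_bins)
        means.append(statistics.mean(samples[start:stop]))
    return means

dense_f2 = [1500 + 2 * i for i in range(110)]  # fake 220 ms track, 2 ms steps
coarse_f2 = bin_track(dense_f2)                # 11 averaged F2 values
len(coarse_f2)  # → 11
```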

3 Data processing and analysis using R

Acoustic analysis is done, hooray! Now let’s move onto the more fun part – data wrangling, visualisation and analysis using R. I can think of two broad paths to data analysis, and I’ll explain them one by one below.

3.1 Based on aggregated_data.csv

3.1.1 Data processing

An easier way to analyse the data is to use aggregated_data.csv. Here, let’s convert the file into a more tidyverse-friendly format and then try some plotting.

Data transformation here is based on code originally written by Dr Sam Kirkham (Lancaster University). We will first import the relevant data sets: aggregated_data.csv and segmentation_information.csv.

We have separate data sets for female and male speakers, so we’ll import them separately and merge them at a later stage.

Female data:

# load packages
library(tidyverse)

# import data sets: female data
## aggregated_data.csv
df_aggr_f <- readr::read_csv("/Volumes/Samsung_T5/data/female/output/processed_data/aggregated_data.csv")
## "aggregated_data.csv" contains formant frequency values, duration, f0, etc., but lacks other information.

## segmentation info
df_segment_info_f <- readr::read_csv("/Volumes/Samsung_T5/data/female/output/segmentation_information.csv")
## "segmentation_information.csv" supplements "aggregated_data.csv" with information about the context of the extracted sounds, vowel duration, stress, comments, etc.

df_segment_info_f <- df_segment_info_f |> 
  dplyr::rename(file = outputfile)
## Rename the "outputfile" column to "file" so that it is compatible with the "aggregated_data.csv".

df_f <- merge(df_aggr_f, df_segment_info_f, by = "file", all = T)
## Merging the two csv files by the "file" column

df_f <- na.omit(df_f) # omitting NA

Male data:

# import data sets: male data
## aggregated_data.csv
df_aggr_m <- readr::read_csv("/Volumes/Samsung_T5/data/male/output/processed_data/aggregated_data.csv")
## "aggregated_data.csv" contains formant frequency values, duration, f0, etc., but lacks other information.

## segmentation info
df_segment_info_m <- readr::read_csv("/Volumes/Samsung_T5/data/male/output/segmentation_information.csv")
## "segmentation_information.csv" supplements "aggregated_data.csv" with information about the context of the extracted sounds, vowel duration, stress, comments, etc.

df_segment_info_m <- df_segment_info_m |> 
  dplyr::rename(file = outputfile)
## Rename the "outputfile" column to "file" so that it is compatible with the "aggregated_data.csv".

df_m <- merge(df_aggr_m, df_segment_info_m, by = "file", all = T)
## Merging the two csv files by the "file" column

df_m <- na.omit(df_m) # omitting NA

Let’s then merge the female and male data and reshape the data frame into a long format.

# combine female and male data
df <- rbind(df_f, df_m)

df_long <- df |>
  tidyr::pivot_longer(contains(c("f1", "f2", "f3")), # add "f4" if you extract F4 as well 
               names_to = c("formant", "timepoint"), 
               names_pattern = "(f\\d)(\\d+)",
               values_to = "hz")

Then, we will add proportional time information – we have extracted formant frequencies summarised into 11 bins, meaning that we can express the temporal information from 0% to 100% in 10% increments.

If you recall, the audio file names contain information about the speaker background. Let’s add them into the data frame, too.

df_long <- df_long |> 
  tidyr::spread(key = formant, value = hz) |> 
  dplyr::select(-duration.y) |> # drop one of the two duration columns
  dplyr::rename(
    duration = duration.x) |> # rename the duration column
  dplyr::mutate(
    timepoint = as.numeric(timepoint),  
    percent = (timepoint - 1) * 10, # adding proportional time
    speaker =
      str_sub(file, start = 5, end = 7), # speaker ID: three digits
    speaker = as.factor(speaker),
    gender = 
      str_sub(file, start = 9, end = 9), # gender: F or M
    L1 =
      str_sub(file, start = 11, end = 13), # L1: CMN, CCT ...
  )

# within-speaker normalisation
df_long <- df_long |> 
  dplyr::group_by(speaker) |> 
  dplyr::mutate(
    f1z = scale(f1),
    f2z = scale(f2),
    f3z = scale(f3)
  ) |> 
  dplyr::ungroup()

# check data
df_long |> 
  dplyr::group_by(gender, L1) |> 
  dplyr::summarise() |> 
  dplyr::ungroup()
## `summarise()` has grouped output by 'gender'. You can override using the
## `.groups` argument.
## # A tibble: 12 × 2
##    gender L1   
##    <chr>  <chr>
##  1 F      CCT  
##  2 F      CMN  
##  3 F      ENG  
##  4 F      JPN  
##  5 F      KOR  
##  6 F      SPA  
##  7 M      CCT  
##  8 M      CMN  
##  9 M      ENG  
## 10 M      JPN  
## 11 M      KOR  
## 12 M      SPA

The data looks good! On we go to visualisation!

3.1.2 Data visualisation

Let’s try some data visualisation. Having temporal information in a proportional manner is useful because you can extract formant frequencies at an arbitrary point in time during each vowel interval.

Let’s first try visualising monophthongs based on midpoint measurements. We’ll omit tokens preceded or followed by /r/, /w/ or /j/, as well as those followed by /l/ or /ŋ/.

df_long_mono <- df_long |> 
  dplyr::filter(
    !next_sound %in% c("R", "W", "Y"), # monophthongs followed by /r/, /w/, and /j/ were avoided
    !previous_sound %in% c("R", "W", "Y"), # monophthongs preceded by /r/, /w/, and /j/ were avoided
    !next_sound %in% c("L", "NG"), # monophthongs followed by /l/ and /ng/ were avoided
    percent == 50, # specifying vowel midpoint
    !(vowel %in% c("AW", "AY", "EY", "OW", "OY")) # monophthongs
  ) |> 
  dplyr::mutate(
    vowel_ipa =
      case_when(
        str_detect(vowel, "AA") ~ "ɑ",
        str_detect(vowel, "AE") ~ "æ",
        str_detect(vowel, "AH") ~ "ʌ",
        str_detect(vowel, "AO") ~ "ɔ",
        str_detect(vowel, "EH") ~ "ɛ",
        str_detect(vowel, "ER") ~ "ɝ",
        str_detect(vowel, "IH") ~ "ɪ",
        str_detect(vowel, "IY") ~ "i",
        str_detect(vowel, "UH") ~ "ʊ",
        str_detect(vowel, "UW") ~ "u",
        )
    ) # add IPA symbols for visualisation

We need mean formant values where we’ll put the IPA labels.

# Calculate vowel means
df_mean <- df_long_mono |> 
  dplyr::group_by(gender, vowel, vowel_ipa) |> 
  dplyr::summarise(
    m_f1 = mean(f1z),
    m_f2 = mean(f2z)
  ) |> 
  dplyr::ungroup()
## `summarise()` has grouped output by 'gender', 'vowel'. You can override using
## the `.groups` argument.
# plot
df_long_mono |> 
  dplyr::mutate(
    L1 = case_when(
      L1 == "CCT" ~ "Cantonese",
      L1 == "CMN" ~ "Mandarin",
      L1 == "ENG" ~ "English",
      L1 == "KOR" ~ "Korean",
      L1 == "JPN" ~ "Japanese",
      L1 == "SPA" ~ "Spanish",
    ) # making L1 labels to be more readable
  ) |> 
  ggplot(aes(x = f2z, y = f1z, colour = vowel_ipa)) +
  geom_point(size = 1, alpha = 0.5, show.legend = FALSE) +
  geom_label(data = df_mean, aes(x = m_f2, y = m_f1, label = vowel_ipa, colour = vowel), show.legend = FALSE) +
  scale_x_reverse(position = "top") +
  scale_y_reverse(position = "right") +
  labs(x = "normalised F2\n", y = "normalised F1\n", title = "vowel midpoint") +
  facet_grid(gender ~ L1) +
  theme(axis.text = element_text(size = 8),
        axis.title = element_text(size = 15),
        strip.text.x = element_text(size = 15),
        strip.text.y = element_text(size = 15, angle = 0),
        plot.title = element_text(size = 20, hjust = 0, face = "bold")
  ) 

We can also plot temporal changes in formant frequency. For example, here is a comparison of F2 dynamics between L1 English and L1 Japanese speakers. Please feel free to explore any other L1 comparisons!

df_long |> 
  dplyr::filter(
    L1 %in% c("ENG", "JPN") # change for different L1 pairs
  ) |> 
  ggplot(aes(x = percent, y = f2z, colour = L1)) +
  geom_point(alpha = 0.05) +
  geom_path(aes(group = number), alpha = 0.05) +
  geom_smooth(aes(group = L1)) + # you could also add smooths
  geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.5) +
  labs(x = "proportional time", y = "normalised F2\n", title = "F2 dynamics") +
  facet_wrap( ~ vowel) +
  scale_colour_manual(values = alpha(c("brown4", "blue4"))) +
  theme(axis.text = element_text(size = 10),
        axis.title = element_text(size = 15),
        strip.text.x = element_text(size = 15),
        strip.text.y = element_text(size = 15, angle = 0),
        plot.title = element_text(size = 20, hjust = 0, face = "bold")
  ) 
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

3.1.3 Data analysis

Finally, we could try fitting some statistical models to investigate whether vowel realisations differ depending on the speaker’s L1 background.

We could fit ordinary linear-mixed effect models for the midpoint measurement.

library(lme4)
library(lmerTest)
library(emmeans)

# converting variables into factor and dropping empty levels
df_long_mono$vowel <- droplevels(as.factor(df_long_mono$vowel))
df_long_mono$speaker <- as.factor(df_long_mono$speaker)
df_long_mono$L1 <- as.factor(df_long_mono$L1)

# run model -- random intercepts for speaker prevented the model from converging, so we only include random intercepts for item (i.e., word)
m1 <- lme4::lmer(f2z ~ L1 + vowel + L1:vowel + (1|word), data = df_long_mono, REML = FALSE)

## check what optimiser would let the model converge
lme4::allFit(m1)
## bobyqa : [OK]
## Nelder_Mead : [OK]
## nlminbwrap : [OK]
## optimx.L-BFGS-B : [OK]
## nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
## nloptwrap.NLOPT_LN_BOBYQA : [OK]
## original model:
## f2z ~ L1 + vowel + L1:vowel + (1 | word) 
## data:  df_long_mono 
## optimizers (6): bobyqa, Nelder_Mead, nlminbwrap, optimx.L-BFGS-B,nloptwrap.NLOPT_LN_NELDERME...
## differences in negative log-likelihoods:
## max= 1.36e-08 ; std dev= 5.51e-09
## model summary
summary(m1)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: f2z ~ L1 + vowel + L1:vowel + (1 | word)
##    Data: df_long_mono
## 
##      AIC      BIC   logLik deviance df.resid 
##   1733.1   2005.3   -810.6   1621.1      898 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.7994 -0.4515  0.0443  0.4749  5.3014 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  word     (Intercept) 0.05468  0.2338  
##  Residual             0.30162  0.5492  
## Number of obs: 954, groups:  word, 33
## 
## Fixed effects:
##                Estimate Std. Error t value
## (Intercept)    0.313053   0.125491   2.495
## L1CMN         -0.099816   0.130840  -0.763
## L1ENG          0.116350   0.157106   0.741
## L1JPN         -0.055266   0.151888  -0.364
## L1KOR         -0.089008   0.126914  -0.701
## L1SPA         -0.424268   0.133042  -3.189
## vowelAH       -0.721014   0.180617  -3.992
## vowelAO       -1.285112   0.314431  -4.087
## vowelEH        0.350424   0.206343   1.698
## vowelER       -0.503719   0.381886  -1.319
## vowelIH        1.089329   0.195988   5.558
## vowelIY        1.190798   0.215988   5.513
## vowelUH       -0.632208   0.243127  -2.600
## vowelUW       -0.944802   0.216583  -4.362
## L1CMN:vowelAH  0.115546   0.180073   0.642
## L1ENG:vowelAH -0.145464   0.202512  -0.718
## L1JPN:vowelAH  0.035987   0.211638   0.170
## L1KOR:vowelAH -0.116334   0.178573  -0.651
## L1SPA:vowelAH  0.352460   0.174688   2.018
## L1CMN:vowelAO  0.378034   0.298479   1.267
## L1ENG:vowelAO  0.075532   0.316366   0.239
## L1JPN:vowelAO  0.148117   0.369022   0.401
## L1KOR:vowelAO  0.216276   0.302510   0.715
## L1SPA:vowelAO  1.424216   0.305131   4.668
## L1CMN:vowelEH  0.015590   0.230184   0.068
## L1ENG:vowelEH -0.433198   0.242784  -1.784
## L1JPN:vowelEH  0.032354   0.275119   0.118
## L1KOR:vowelEH -0.394857   0.229709  -1.719
## L1SPA:vowelEH  0.347002   0.231372   1.500
## L1CMN:vowelER  0.644001   0.409792   1.572
## L1ENG:vowelER -0.003092   0.418918  -0.007
## L1JPN:vowelER -0.049103   0.499284  -0.098
## L1KOR:vowelER  0.219972   0.408555   0.538
## L1SPA:vowelER  0.789466   0.410500   1.923
## L1CMN:vowelIH -0.156048   0.202372  -0.771
## L1ENG:vowelIH -0.830810   0.255930  -3.246
## L1JPN:vowelIH -0.032785   0.236510  -0.139
## L1KOR:vowelIH -0.195451   0.207896  -0.940
## L1SPA:vowelIH  0.079539   0.196537   0.405
## L1CMN:vowelIY  0.235240   0.227439   1.034
## L1ENG:vowelIY  0.539427   0.243878   2.212
## L1JPN:vowelIY  0.219901   0.270993   0.811
## L1KOR:vowelIY  0.074341   0.222975   0.333
## L1SPA:vowelIY  0.036647   0.229116   0.160
## L1CMN:vowelUH  0.415790   0.259594   1.602
## L1ENG:vowelUH  0.127067   0.273774   0.464
## L1JPN:vowelUH  0.461182   0.329909   1.398
## L1KOR:vowelUH  0.593944   0.267355   2.222
## L1SPA:vowelUH  0.375699   0.260711   1.441
## L1CMN:vowelUW  0.207827   0.223181   0.931
## L1ENG:vowelUW  0.648317   0.248848   2.605
## L1JPN:vowelUW  0.240342   0.279040   0.861
## L1KOR:vowelUW  0.221985   0.225122   0.986
## L1SPA:vowelUW  0.379884   0.226304   1.679
# significance testing
## nested model for the interaction
m2 <- lme4::lmer(f2z ~ L1 + vowel + (1|word), data = df_long_mono, REML = FALSE)

## model comparison: full model significantly improves the model fit
anova(m1, m2, test = "Chisq")
## Data: df_long_mono
## Models:
## m2: f2z ~ L1 + vowel + (1 | word)
## m1: f2z ~ L1 + vowel + L1:vowel + (1 | word)
##    npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## m2   16 1756.1 1833.9 -862.06   1724.1                         
## m1   56 1733.2 2005.3 -810.57   1621.2 102.97 40  1.859e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## post-hoc analysis
emmeans::emmeans(m1, pairwise ~ L1 | vowel)
## $emmeans
## vowel = AE:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT  0.31305 0.139 155.7   0.0389   0.5872
##  CMN  0.21324 0.144 175.9  -0.0709   0.4974
##  ENG  0.42940 0.171 310.4   0.0938   0.7650
##  JPN  0.25779 0.164 296.0  -0.0659   0.5815
##  KOR  0.22405 0.140 160.6  -0.0527   0.5008
##  SPA -0.11121 0.145 187.1  -0.3977   0.1753
## 
## vowel = AH:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT -0.40796 0.146 101.4  -0.6973  -0.1187
##  CMN -0.39223 0.148 109.1  -0.6861  -0.0984
##  ENG -0.43707 0.152 121.8  -0.7383  -0.1359
##  JPN -0.42724 0.169 194.3  -0.7612  -0.0933
##  KOR -0.61330 0.150 114.1  -0.9100  -0.3166
##  SPA -0.47977 0.141  84.3  -0.7597  -0.1999
## 
## vowel = AO:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT -0.97206 0.327 133.0  -1.6182  -0.3259
##  CMN -0.69384 0.299 143.5  -1.2842  -0.1034
##  ENG -0.78018 0.327 133.0  -1.4263  -0.1340
##  JPN -0.87921 0.383 257.0  -1.6326  -0.1258
##  KOR -0.84479 0.327 133.0  -1.4910  -0.1986
##  SPA  0.02789 0.327 133.0  -0.6183   0.6741
## 
## vowel = EH:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT  0.66348 0.195 267.6   0.2791   1.0478
##  CMN  0.57925 0.183 257.2   0.2181   0.9404
##  ENG  0.34663 0.167 236.2   0.0168   0.6764
##  JPN  0.64057 0.226 508.4   0.1958   1.0854
##  KOR  0.17961 0.189 262.5  -0.1923   0.5516
##  SPA  0.58621 0.183 257.2   0.2250   0.9474
## 
## vowel = ER:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT -0.19067 0.397 158.3  -0.9752   0.5939
##  CMN  0.35352 0.397 158.3  -0.4311   1.1381
##  ENG -0.07741 0.397 158.3  -0.8620   0.7072
##  JPN -0.29503 0.487 345.3  -1.2527   0.6627
##  KOR -0.05970 0.397 158.3  -0.8443   0.7249
##  SPA  0.17453 0.397 158.3  -0.6100   0.9591
## 
## vowel = IH:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT  1.40238 0.168 121.1   1.0706   1.7341
##  CMN  1.14652 0.171 134.1   0.8075   1.4856
##  ENG  0.68792 0.214 311.5   0.2669   1.1089
##  JPN  1.31433 0.199 243.0   0.9231   1.7056
##  KOR  1.11792 0.182 172.0   0.7594   1.4764
##  SPA  1.05765 0.163 106.8   0.7344   1.3809
## 
## vowel = IY:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT  1.50385 0.194 140.9   1.1195   1.8882
##  CMN  1.63928 0.197 149.6   1.2495   2.0290
##  ENG  2.15963 0.199 152.4   1.7668   2.5525
##  JPN  1.66849 0.236 302.3   1.2043   2.1326
##  KOR  1.48918 0.195 141.9   1.1038   1.8746
##  SPA  1.11623 0.199 152.4   0.7234   1.5091
## 
## vowel = UH:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT -0.31916 0.229 157.0  -0.7722   0.1338
##  CMN -0.00318 0.229 157.0  -0.4562   0.4498
##  ENG -0.07574 0.229 157.0  -0.5287   0.3773
##  JPN  0.08676 0.300 419.3  -0.5028   0.6763
##  KOR  0.18578 0.241 192.1  -0.2891   0.6607
##  SPA -0.36772 0.229 157.0  -0.8207   0.0853
## 
## vowel = UW:
##  L1    emmean    SE    df lower.CL upper.CL
##  CCT -0.63175 0.195 140.6  -1.0177  -0.2458
##  CMN -0.52374 0.192 131.5  -0.9033  -0.1441
##  ENG  0.13292 0.208 175.1  -0.2775   0.5434
##  JPN -0.44667 0.248 342.6  -0.9348   0.0415
##  KOR -0.49877 0.199 151.1  -0.8926  -0.1049
##  SPA -0.67613 0.195 140.6  -1.0621  -0.2902
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## 
## $contrasts
## vowel = AE:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN  0.09982 0.134 969   0.744  0.9764
##  CCT - ENG -0.11635 0.161 982  -0.721  0.9794
##  CCT - JPN  0.05527 0.156 968   0.355  0.9993
##  CCT - KOR  0.08901 0.130 968   0.684  0.9837
##  CCT - SPA  0.42427 0.136 968   3.110  0.0236
##  CMN - ENG -0.21617 0.165 980  -1.311  0.7788
##  CMN - JPN -0.04455 0.160 968  -0.279  0.9998
##  CMN - KOR -0.01081 0.135 969  -0.080  1.0000
##  CMN - SPA  0.32445 0.141 969   2.299  0.1954
##  ENG - JPN  0.17162 0.183 978   0.939  0.9364
##  ENG - KOR  0.20536 0.162 980   1.271  0.8006
##  ENG - SPA  0.54062 0.167 979   3.244  0.0154
##  JPN - KOR  0.03374 0.156 968   0.216  0.9999
##  JPN - SPA  0.36900 0.162 969   2.279  0.2036
##  KOR - SPA  0.33526 0.137 969   2.440  0.1437
## 
## vowel = AH:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN -0.01573 0.127 968  -0.124  1.0000
##  CCT - ENG  0.02911 0.131 971   0.222  0.9999
##  CCT - JPN  0.01928 0.151 967   0.128  1.0000
##  CCT - KOR  0.20534 0.129 969   1.594  0.6029
##  CCT - SPA  0.07181 0.116 974   0.618  0.9897
##  CMN - ENG  0.04484 0.133 970   0.337  0.9994
##  CMN - JPN  0.03501 0.153 968   0.229  0.9999
##  CMN - KOR  0.22107 0.131 968   1.686  0.5413
##  CMN - SPA  0.08754 0.119 973   0.737  0.9772
##  ENG - JPN -0.00983 0.157 970  -0.063  1.0000
##  ENG - KOR  0.17623 0.135 968   1.308  0.7809
##  ENG - SPA  0.04269 0.123 973   0.347  0.9993
##  JPN - KOR  0.18606 0.155 969   1.202  0.8361
##  JPN - SPA  0.05253 0.144 971   0.364  0.9992
##  KOR - SPA -0.13353 0.121 973  -1.103  0.8803
## 
## vowel = AO:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN -0.27822 0.275 984  -1.010  0.9147
##  CCT - ENG -0.19188 0.282 967  -0.681  0.9840
##  CCT - JPN -0.09285 0.345 967  -0.269  0.9998
##  CCT - KOR -0.12727 0.282 967  -0.452  0.9976
##  CCT - SPA -0.99995 0.282 967  -3.551  0.0054
##  CMN - ENG  0.08634 0.275 984   0.313  0.9996
##  CMN - JPN  0.18537 0.340 978   0.545  0.9942
##  CMN - KOR  0.15095 0.275 984   0.548  0.9941
##  CMN - SPA -0.72173 0.275 984  -2.620  0.0934
##  ENG - JPN  0.09903 0.345 967   0.287  0.9997
##  ENG - KOR  0.06461 0.282 967   0.229  0.9999
##  ENG - SPA -0.80807 0.282 967  -2.870  0.0480
##  JPN - KOR -0.03442 0.345 967  -0.100  1.0000
##  JPN - SPA -0.90710 0.345 967  -2.630  0.0910
##  KOR - SPA -0.87268 0.282 967  -3.099  0.0244
## 
## vowel = EH:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN  0.08423 0.194 972   0.434  0.9981
##  CCT - ENG  0.31685 0.189 995   1.676  0.5481
##  CCT - JPN  0.02291 0.235 971   0.097  1.0000
##  CCT - KOR  0.48386 0.196 969   2.464  0.1358
##  CCT - SPA  0.07727 0.194 972   0.398  0.9987
##  CMN - ENG  0.23262 0.180 979   1.289  0.7911
##  CMN - JPN -0.06131 0.230 967  -0.267  0.9998
##  CMN - KOR  0.39964 0.191 969   2.097  0.2898
##  CMN - SPA -0.00696 0.188 967  -0.037  1.0000
##  ENG - JPN -0.29394 0.224 975  -1.312  0.7784
##  ENG - KOR  0.16702 0.184 986   0.906  0.9451
##  ENG - SPA -0.23958 0.180 979  -1.328  0.7696
##  JPN - KOR  0.46095 0.232 968   1.984  0.3520
##  JPN - SPA  0.05435 0.230 967   0.236  0.9999
##  KOR - SPA -0.40660 0.191 969  -2.133  0.2710
## 
## vowel = ER:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN -0.54419 0.398 967  -1.367  0.7471
##  CCT - ENG -0.11326 0.398 967  -0.284  0.9998
##  CCT - JPN  0.10437 0.488 967   0.214  0.9999
##  CCT - KOR -0.13096 0.398 967  -0.329  0.9995
##  CCT - SPA -0.36520 0.398 967  -0.917  0.9422
##  CMN - ENG  0.43093 0.398 967   1.082  0.8886
##  CMN - JPN  0.64855 0.488 967   1.330  0.7685
##  CMN - KOR  0.41322 0.398 967   1.038  0.9052
##  CMN - SPA  0.17899 0.398 967   0.449  0.9977
##  ENG - JPN  0.21763 0.488 967   0.446  0.9978
##  ENG - KOR -0.01771 0.398 967  -0.044  1.0000
##  ENG - SPA -0.25194 0.398 967  -0.633  0.9886
##  JPN - KOR -0.23533 0.488 967  -0.483  0.9968
##  JPN - SPA -0.46957 0.488 967  -0.963  0.9295
##  KOR - SPA -0.23423 0.398 967  -0.588  0.9918
## 
## vowel = IH:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN  0.25586 0.158 969   1.616  0.5882
##  CCT - ENG  0.71446 0.208 989   3.441  0.0079
##  CCT - JPN  0.08805 0.186 970   0.474  0.9970
##  CCT - KOR  0.28446 0.169 969   1.684  0.5424
##  CCT - SPA  0.34473 0.148 968   2.324  0.1855
##  CMN - ENG  0.45860 0.210 984   2.189  0.2438
##  CMN - JPN -0.16781 0.190 970  -0.883  0.9505
##  CMN - KOR  0.02859 0.173 969   0.165  1.0000
##  CMN - SPA  0.08886 0.153 969   0.580  0.9924
##  ENG - JPN -0.62641 0.234 989  -2.678  0.0806
##  ENG - KOR -0.43000 0.218 983  -1.971  0.3599
##  ENG - SPA -0.36973 0.204 990  -1.812  0.4583
##  JPN - KOR  0.19641 0.199 971   0.986  0.9224
##  JPN - SPA  0.25668 0.182 969   1.414  0.7188
##  KOR - SPA  0.06027 0.165 969   0.366  0.9991
## 
## vowel = IY:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN -0.13542 0.191 970  -0.710  0.9808
##  CCT - ENG -0.65578 0.191 974  -3.426  0.0083
##  CCT - JPN -0.16464 0.230 969  -0.715  0.9801
##  CCT - KOR  0.01467 0.188 970   0.078  1.0000
##  CCT - SPA  0.38762 0.191 974   2.025  0.3286
##  CMN - ENG -0.52035 0.193 969  -2.690  0.0782
##  CMN - JPN -0.02921 0.232 968  -0.126  1.0000
##  CMN - KOR  0.15009 0.191 968   0.788  0.9696
##  CMN - SPA  0.52304 0.193 969   2.704  0.0754
##  ENG - JPN  0.49114 0.232 968   2.114  0.2808
##  ENG - KOR  0.67044 0.191 969   3.516  0.0061
##  ENG - SPA  1.04340 0.193 967   5.401  <.0001
##  JPN - KOR  0.17930 0.230 967   0.780  0.9709
##  JPN - SPA  0.55226 0.232 968   2.377  0.1653
##  KOR - SPA  0.37295 0.191 969   1.956  0.3687
## 
## vowel = UH:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN -0.31597 0.230 967  -1.374  0.7425
##  CCT - ENG -0.24342 0.230 967  -1.059  0.8975
##  CCT - JPN -0.40592 0.300 969  -1.351  0.7560
##  CCT - KOR -0.50494 0.241 968  -2.092  0.2920
##  CCT - SPA  0.04857 0.230 967   0.211  0.9999
##  CMN - ENG  0.07256 0.230 967   0.316  0.9996
##  CMN - JPN -0.08994 0.300 969  -0.299  0.9997
##  CMN - KOR -0.18896 0.241 968  -0.783  0.9704
##  CMN - SPA  0.36454 0.230 967   1.586  0.6084
##  ENG - JPN -0.16250 0.300 969  -0.541  0.9945
##  ENG - KOR -0.26152 0.241 968  -1.084  0.8879
##  ENG - SPA  0.29199 0.230 967   1.270  0.8014
##  JPN - KOR -0.09902 0.309 969  -0.321  0.9996
##  JPN - SPA  0.45448 0.300 969   1.513  0.6559
##  KOR - SPA  0.55351 0.241 968   2.294  0.1975
## 
## vowel = UW:
##  contrast  estimate    SE  df t.ratio p.value
##  CCT - CMN -0.10801 0.185 969  -0.582  0.9922
##  CCT - ENG -0.76467 0.198 976  -3.861  0.0017
##  CCT - JPN -0.18508 0.240 973  -0.771  0.9724
##  CCT - KOR -0.13298 0.191 970  -0.697  0.9823
##  CCT - SPA  0.04438 0.188 967   0.236  0.9999
##  CMN - ENG -0.65666 0.197 984  -3.337  0.0113
##  CMN - JPN -0.07706 0.239 979  -0.322  0.9995
##  CMN - KOR -0.02497 0.189 975  -0.132  1.0000
##  CMN - SPA  0.15239 0.185 969   0.822  0.9635
##  ENG - JPN  0.57959 0.247 968   2.350  0.1754
##  ENG - KOR  0.63169 0.200 970   3.160  0.0202
##  ENG - SPA  0.80905 0.198 976   4.085  0.0007
##  JPN - KOR  0.05210 0.242 969   0.216  0.9999
##  JPN - SPA  0.22946 0.240 973   0.955  0.9317
##  KOR - SPA  0.17736 0.191 970   0.930  0.9388
## 
## Degrees-of-freedom method: kenward-roger 
## P value adjustment: tukey method for comparing a family of 6 estimates

3.2 Based on information from the csv folder

Another route for spectral analysis is to use FastTrack’s initial formant sampling, taken every 2 ms. This information is stored in the csv folder.

3.2.1 Data processing

Let’s import all .csv files stored in the csv folder by running the loop below. Again, we’ll import female and male data separately and merge them later.

Female data:

## loading data
# index csv files in the directory
file_list <- list.files("/Volumes/Samsung_T5/data/female/output/csvs", pattern = "*.csv", full.names = TRUE)

# create an empty list to store data
data_list <- list()

for(i in seq_along(file_list)){
  current_data <- read.csv(file_list[i], header = TRUE)
  
  # Add a new column with the filename
  current_data$filename <- basename(file_list[i])
  
  data_list[[i]] <- current_data
}

# bind all data from the list into a data frame
dat_f <- dplyr::bind_rows(data_list) |> 
  dplyr::relocate(filename)

# View the result
head(dat_f)
##                         filename  time    f1    b1     f2    b2     f3    b3
## 1 ALL_011_F_CMN_ENG_NWS_0001.csv 0.026 542.5 261.8 1312.4 218.7 2492.4 244.7
## 2 ALL_011_F_CMN_ENG_NWS_0001.csv 0.028 544.2 253.5 1308.2 196.9 2479.9 237.3
## 3 ALL_011_F_CMN_ENG_NWS_0001.csv 0.030 547.4 248.3 1306.2 180.4 2469.7 225.5
## 4 ALL_011_F_CMN_ENG_NWS_0001.csv 0.032 552.1 245.4 1305.6 167.4 2461.2 209.9
## 5 ALL_011_F_CMN_ENG_NWS_0001.csv 0.034 557.7 244.4 1305.9 157.0 2453.8 193.9
## 6 ALL_011_F_CMN_ENG_NWS_0001.csv 0.036 564.8 245.5 1306.9 148.5 2447.1 180.5
##     f1p    f2p    f3p    f0 intensity harmonicity
## 1 543.1 1301.4 2480.4 270.2      69.3        30.9
## 2 545.2 1302.8 2477.2 270.2      69.4        31.0
## 3 548.7 1304.9 2472.2 270.2      69.4        31.0
## 4 553.2 1307.5 2465.8 270.3      69.4        31.1
## 5 558.7 1310.1 2458.4 270.3      69.4        31.1
## 6 564.8 1312.5 2450.5 270.3      69.4        31.1
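As an aside, the same import can be written more compactly with purrr (already attached as part of the tidyverse in this session). This is only a sketch of an alternative, using the same path as above; the result should be equivalent to the dat_f built with the loop.

```r
# compact alternative to the for-loop: read every csv in one pipeline
library(tidyverse)

file_list <- list.files("/Volumes/Samsung_T5/data/female/output/csvs",
                        pattern = "*.csv", full.names = TRUE)

dat_f <- file_list |>
  purrr::set_names() |>                       # list names become the .id column
  purrr::map(read.csv, header = TRUE) |>      # one data frame per file
  dplyr::bind_rows(.id = "filename") |>       # stack them, keeping the path
  dplyr::mutate(filename = basename(filename)) |>
  dplyr::relocate(filename)
```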

Male data:

## loading data
# index csv files in the directory
file_list <- list.files("/Volumes/Samsung_T5/data/male/output/csvs", pattern = "*.csv", full.names = TRUE)

# create an empty list to store data
data_list <- list()

for(i in seq_along(file_list)){
  current_data <- read.csv(file_list[i], header = TRUE)
  
  # Add a new column with the filename
  current_data$filename <- basename(file_list[i])
  
  data_list[[i]] <- current_data
}

# bind all data from the list into a data frame
dat_m <- dplyr::bind_rows(data_list) |> 
  dplyr::relocate(filename)

# View the result
head(dat_m)
##                         filename  time    f1    b1     f2    b2     f3    b3
## 1 ALL_005_M_CMN_ENG_NWS_0001.csv 0.026 549.9 163.9 1491.2 136.0 2742.7 209.1
## 2 ALL_005_M_CMN_ENG_NWS_0001.csv 0.028 555.1 182.4 1472.8 141.1 2746.2 197.4
## 3 ALL_005_M_CMN_ENG_NWS_0001.csv 0.030 563.3 197.4 1455.9 147.3 2754.1 170.8
## 4 ALL_005_M_CMN_ENG_NWS_0001.csv 0.032 569.5 208.0 1439.2 156.5 2762.6 146.1
## 5 ALL_005_M_CMN_ENG_NWS_0001.csv 0.034 573.0 214.2 1421.6 159.1 2768.9 130.2
## 6 ALL_005_M_CMN_ENG_NWS_0001.csv 0.036 576.9 213.2 1409.1 143.7 2769.9 126.7
##     f1p    f2p    f3p    f0 intensity harmonicity
## 1 577.6 1458.8 2761.9 175.6      71.1        16.0
## 2 577.2 1457.0 2759.7 176.1      70.9        16.1
## 3 576.7 1454.1 2756.0 176.7      70.8        16.1
## 4 576.0 1450.1 2751.0 177.3      70.7        15.9
## 5 575.4 1444.9 2745.0 178.0      70.6        15.8
## 6 574.8 1438.6 2737.9 178.7      70.6        15.7

Now let’s add some relevant information derived from the file names.

# merge female and male data
dat <- rbind(dat_f, dat_m)

# adding speaker, L1, gender etc from the file name
dat <- dat |> 
  dplyr::mutate(
    speaker =
      str_sub(filename, start = 5, end = 7), # speaker ID: three digits
    speaker = as.factor(speaker),
    gender = 
      str_sub(filename, start = 9, end = 9), # gender: F or M
    L1 =
      str_sub(filename, start = 11, end = 13), # L1: CMN, CCT ...
  )

# adding proportional time
dat <- dat |> 
  dplyr::group_by(filename) |> 
  dplyr::mutate(
    duration = max(time) - min(time),
    percent = (time - min(time)) / duration * 100 # make sure percent starts at 0 and ends at 100
  ) |> 
  dplyr::ungroup() |> 
  dplyr::relocate(filename, time, percent)

# within-speaker formant normalisation (z-scores)
# note: scale() returns a one-column matrix; wrap it in as.numeric() if that
# causes trouble downstream
dat <- dat |> 
  dplyr::group_by(speaker) |> 
  dplyr::mutate(
    f1z = scale(f1),
    f2z = scale(f2),
    f3z = scale(f3)
  ) |> 
  dplyr::ungroup()
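Two quick sanity checks are worth running at this point: one on the fixed-position filename parsing (the example filename below is taken from the data we just imported), and one confirming that percent really runs from 0 to 100 within every file.

```r
# check the fixed positions against a real filename from this data set
fn <- "ALL_011_F_CMN_ENG_NWS_0001.csv"
stringr::str_sub(fn, 5, 7)    # speaker ID: "011"
stringr::str_sub(fn, 9, 9)    # gender: "F"
stringr::str_sub(fn, 11, 13)  # L1: "CMN"

# check that proportional time behaves as intended in every file
dat |> 
  dplyr::group_by(filename) |> 
  dplyr::summarise(start = min(percent), end = max(percent)) |> 
  dplyr::summarise(all_ok = all(start == 0 & end == 100))
```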

We also need to add vowel information. This is where file_information.csv comes in handy, as it records the correspondence between filename and vowel.

# import file_information.csv
## female
df_file_f <- readr::read_csv("/Volumes/Samsung_T5/data/female/output/file_information.csv")

## male
df_file_m <- readr::read_csv("/Volumes/Samsung_T5/data/male/output/file_information.csv")

## merge
df_file <- rbind(df_file_f, df_file_m)

# create a common key to merge two data sets
## drop the extension from the “filename” column in dat and call it “file”
dat <- dat |> 
  dplyr::mutate(
    file = str_sub(filename, start = 1, end = -5)
  ) |> 
  dplyr::relocate(file)

## same for df_file
df_file <- df_file |> 
  dplyr::mutate(
    file = str_sub(file, start = 1, end = -5)
  ) |> 
  dplyr::relocate(file)

# join df_file and dat with the "file" information
dat <- dplyr::left_join(dat, df_file, by = "file") |> 
  dplyr::rename(
    vowel = label
  )
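A left join silently produces NAs for unmatched keys, so it is worth checking that every row found a partner. Here is a hedged sketch of two equivalent checks, assuming dat and df_file as built above:

```r
# files in dat that found no match in df_file (ideally an empty result)
dplyr::anti_join(dat, df_file, by = "file") |> 
  dplyr::distinct(file)

# equivalently: rows where the join left the vowel label empty
dat |> 
  dplyr::filter(is.na(vowel)) |> 
  dplyr::distinct(file)
```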

3.2.2 Data visualisation

Compared to the dynamic visualisation based on aggregate_data.csv, you can see that we now have much finer temporal resolution thanks to the sheer number of data points! Again, here are the time-varying changes in F2.

dat |> 
  dplyr::filter(
    L1 %in% c("ENG", "JPN") # change for different L1 pairs
    ) |> 
  ggplot(aes(x = percent, y = f2z, colour = L1)) +
  geom_point(alpha = 0.05) +
  geom_path(aes(group = number), alpha = 0.05) +
  geom_smooth(aes(group = L1)) + # one smooth per L1
  geom_hline(yintercept = 0, linetype = "dashed", alpha = 0.5) +
  labs(x = "proportional time", y = "normalised F2\n", title = "F2 dynamics") +
  facet_wrap( ~ vowel) +
  scale_colour_manual(values = alpha(c("brown4", "blue4"))) +
  theme(axis.text = element_text(size = 10),
        axis.title = element_text(size = 15),
        strip.text.x = element_text(size = 15),
        strip.text.y = element_text(size = 15, angle = 0),
        plot.title = element_text(size = 20, hjust = 0, face = "bold")
  )  
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
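If you want numbers to go with the smooths, one option is to bin proportional time into deciles and summarise F2 per bin. This is only a sketch of one possible summary, assuming dat as built above:

```r
# mean normalised F2 per decile of proportional time
dat |> 
  dplyr::filter(L1 %in% c("ENG", "JPN")) |> 
  dplyr::mutate(decile = floor(percent / 10) * 10) |>  # 0, 10, ..., 100
  dplyr::group_by(L1, vowel, decile) |> 
  dplyr::summarise(mean_f2z = mean(f2z, na.rm = TRUE), .groups = "drop")
```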

4 Improving analysis

4.1 Manually correcting measurement errors

We’re now familiar with the overall workflow of acoustic analysis using FastTrack. Hooray! FastTrack is very efficient in analysing a large number of vowel tokens. In the data set above, we had a total of 1,964 tokens with the breakdown shown below:

# total number of tokens
dat |> 
  dplyr::group_by(file, L1) |> 
  dplyr::filter(
    percent == 0 # count only the first data point per file
  ) |> 
  dplyr::ungroup() |> 
  dplyr::count() 
## # A tibble: 1 × 1
##       n
##   <int>
## 1  1964
# by L1
dat |> 
  dplyr::group_by(file, L1) |> 
  dplyr::filter(
    percent == 0 # count only the first data point per file
  ) |> 
  dplyr::ungroup() |> 
  dplyr::group_by(L1) |> 
  dplyr::count() |> 
  dplyr::ungroup()
## # A tibble: 6 × 2
##   L1        n
##   <chr> <int>
## 1 CCT     368
## 2 CMN     356
## 3 ENG     318
## 4 JPN     186
## 5 KOR     352
## 6 SPA     384
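The two counting pipelines above can also be collapsed into a single step with distinct(), which keeps one row per file before tallying:

```r
# one row per file, then tally by L1 (same breakdown as above)
dat |> 
  dplyr::distinct(file, L1) |> 
  dplyr::count(L1)
```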

However, FastTrack is by no means error-free. This is especially important for dynamic analysis: the trajectories above showed some potential measurement errors.

FastTrack offers a few ways to address tracking errors. First, it is possible to manually correct the formant tracking in Praat (via FastTrack). I have personally never done this, but you can find more information about it here.

4.2 Nominating different winners

An alternative approach, which I usually take, is to check the tracking accuracy of the remaining analyses and see whether any of them did a ‘better’ job. Among the output files, we briefly talked about the images_comparison folder, where visualisations are stored for all 24 (or however many you specified) analysis steps. In my experience (English /l/ and /r/), formant tracking was often inaccurate when F3 was extremely low for /r/, or when F2/F3 was high for a very clear /l/. Simply eyeballing the comparison images lets you evaluate the tracking accuracy fairly quickly (especially if you’re a Mac user, since you can preview all the image files by pressing the space bar).

When you would like to nominate a different analysis as the winner, you can tell FastTrack to return the tracking results for that particular analysis. This is done by modifying winners.csv: simply type in the number of the analysis you judge to be better. You can replace the tracking for all formants at once, or change the tracking of just one formant. Either way, don’t forget to change the number in the Edit column from 0 to 1.
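If you have many corrections to make, you could also edit winners.csv programmatically rather than in a spreadsheet. The sketch below is hypothetical on several counts: the file path, the example filename, the new winner number, and the winner column name are all placeholders (only the Edit column is described above), so check the header of your own winners.csv first.

```r
# hypothetical sketch: swap the winning analysis for one file in winners.csv
# (path, filename, and column names other than "Edit" are assumptions;
# inspect your own file before running anything like this)
winners <- read.csv("/Volumes/Samsung_T5/data/female/output/winners.csv")

winners <- winners |> 
  dplyr::mutate(
    winner = ifelse(file == "ALL_011_F_CMN_ENG_NWS_0001", 12, winner), # assumed column
    Edit   = ifelse(file == "ALL_011_F_CMN_ENG_NWS_0001", 1, Edit)     # flag as edited
  )

write.csv(winners, "/Volumes/Samsung_T5/data/female/output/winners.csv",
          row.names = FALSE)
```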

Once you have nominated a different winner, you need to run Track folder again. This time, however, untick the Track formants and Autoselect winners boxes at the bottom, as you’re simply telling FastTrack to use a different analysis rather than tracking formants all over again.

5 Session info

sessionInfo()
## R version 4.3.2 (2023-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS 15.3
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/London
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] emmeans_1.8.9   lmerTest_3.1-3  lme4_1.1-35.1   Matrix_1.6-1.1 
##  [5] emuR_2.4.2      knitr_1.45      lubridate_1.9.4 forcats_1.0.0  
##  [9] stringr_1.5.1   dplyr_1.1.4     purrr_1.0.2     readr_2.1.4    
## [13] tidyr_1.3.1     tibble_3.2.1    ggplot2_3.5.1   tidyverse_2.0.0
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6        xfun_0.50           bslib_0.7.0        
##  [4] lattice_0.21-9      numDeriv_2016.8-1.1 tzdb_0.4.0         
##  [7] vctrs_0.6.5         tools_4.3.2         generics_0.1.3     
## [10] pbkrtest_0.5.2      parallel_4.3.2      wrassp_1.0.4       
## [13] highr_0.10          pkgconfig_2.0.3     optimx_2023-10.21  
## [16] uuid_1.1-1          lifecycle_1.0.4     compiler_4.3.2     
## [19] farver_2.1.2        munsell_0.5.1       htmltools_0.5.8.1  
## [22] sass_0.4.9          yaml_2.3.8          pracma_2.4.4       
## [25] pillar_1.10.1       nloptr_2.0.3        crayon_1.5.2       
## [28] jquerylib_0.1.4     MASS_7.3-60         cachem_1.0.8       
## [31] boot_1.3-28.1       nlme_3.1-163        tidyselect_1.2.1   
## [34] digest_0.6.36       mvtnorm_1.2-5       stringi_1.8.4      
## [37] labeling_0.4.3      splines_4.3.2       fastmap_1.1.1      
## [40] grid_4.3.2          colorspace_2.1-1    cli_3.6.3          
## [43] magrittr_2.0.3      utf8_1.2.4          broom_1.0.5        
## [46] withr_3.0.2         backports_1.5.0     scales_1.3.0       
## [49] bit64_4.0.5         estimability_1.4.1  timechange_0.3.0   
## [52] rmarkdown_2.26      bit_4.0.5           png_0.1-8          
## [55] hms_1.1.3           coda_0.19-4.1       evaluate_0.23      
## [58] mgcv_1.9-0          rlang_1.1.4         Rcpp_1.0.14        
## [61] xtable_1.8-4        glue_1.8.0          DBI_1.1.3          
## [64] rstudioapi_0.15.0   vroom_1.6.5         minqa_1.2.6        
## [67] jsonlite_1.8.8      R6_2.5.1